Abstract

The derivation trees of a tree adjoining grammar provide a first insight
into the sentence semantics, and are thus prime targets for generation
systems. We define a formalism, feature-based regular tree grammars, and a
translation from feature-based tree adjoining grammars into this new
formalism. The translation preserves the derivation structures of the
original grammar, and accounts for feature unification.



1 Introduction 



Each sentence derivation in a tree adjoining grammar (Joshi and Schabes 1997,
TAG) results in two parse trees: a derived tree (Figure 1a), that represents
the phrase structure of the sentence, and a derivation tree (Figure 1b), that
records how the elementary trees of the grammar were combined. Each type of
parse tree is better suited for a different set of language processing tasks: the
derived tree is closely related to the lexical elements of the sentence, and the
derivation tree offers a first insight into the sentence semantics (Candito and
Kahane 1998). Furthermore, the derivation tree language of a TAG, being a
regular tree language, is much simpler to manipulate than the corresponding
derived tree language.
Figure 1: Parse trees for "One of the cats has caught a fish." using the grammar
of Figure 2.



Figure 2: A feature-based tree adjoining grammar. For the sake of clarity, we 
identify elementary trees with their anchors in our examples. 



Derivation trees are thus the cornerstone of several approaches to sentence
generation (Koller and Striegnitz 2002; Koller and Stone 2007), that rely
crucially on the ease of encoding regular tree grammars, as dependency grammars
and planning problems respectively. Derivation trees also serve as intermediate
representations from which both derived trees (and thus the linear order
information) and semantics can be computed, e.g. with the abstract categorial
grammars of de Groote (2002), Pogodalla (2004), and Kanazawa (2007), or
similarly with the bimorphisms of Shieber (2006).



Nevertheless, these results do not directly apply to many real-world grammars,
which are expressed in a feature-based variant of TAGs (Vijay-Shanker 1992).
Each elementary tree node of these grammars carries two feature structures
that constrain the allowed substitution or adjunction operations at this
node (see for instance Figure 2). In theory, such structures are unproblematic,
because the possible feature values are drawn from finite domains, and thus the
number of grammar categories could be increased in order to account for all
the possible structures. In practice, the sheer number of structures precludes
such a naive implementation: for instance, the 50 features used in the XTAG
English grammar (XTAG Research Group 2001) together define a domain
containing more than 10^^ different structures. Furthermore, finiteness does not
hold for some grammars, for instance with the semantic features of Gardent
and Kallmeyer (2003).

Ignoring feature structures typically results in massive over-generation in
derivation-centric systems. We define a formalism, feature-based regular tree
grammars, that produces derivation trees that account for the feature structures
found in a tree adjoining grammar. In more detail,

  we recall how to generate the derivation trees of a tree adjoining grammar
  through a regular tree grammar (Section 2), then

  we define feature-based regular tree grammars and present the translation
  from feature-based TAG (Section 3); finally,

  we provide an improved translation inspired by left corner transformations
  (Section 4).



We assume the reader is familiar with the theory of tree-adjoining grammars
(Joshi and Schabes 1997), regular tree grammars (Comon et al. 2007), and
feature unification (Robinson 1965).
2 Regular Tree Grammars of Derivations

In this section, we define an encoding of the set of derivation trees of a tree
adjoining grammar as the language of a regular tree grammar (RTG). Several
encodings equivalent to regular tree grammars have been described in the
literature; we follow here the one of de Groote (2002), but explicitly construct a
regular tree grammar.

Formally, a tree adjoining grammar is a tuple $\langle \Sigma, N, I, A, S \rangle$ where $\Sigma$ is a
terminal alphabet, $N$ is a nonterminal alphabet, $I$ is a set of initial trees $\alpha$, $A$ is
a set of auxiliary trees $\beta$ and $S$ is a distinguished nonterminal from $N$. We note
$\gamma_r$ the root node of the elementary tree $\gamma$ and $\beta_f$ the foot node of the auxiliary
tree $\beta$. Let us denote by $\gamma_1, \dots, \gamma_n$ the active nodes of an elementary tree $\gamma$,
where a substitution or an adjunction can be performed;^1 we call $n$ the rank of
$\gamma$, denoted by $\mathrm{rk}(\gamma)$. We set $\gamma_1$ to be the root node of $\gamma$, i.e. $\gamma_1 = \gamma_r$. Finally,
$\mathrm{lab}(\gamma_i)$ denotes the label of node $\gamma_i$.

Each elementary tree $\gamma$ of the TAG will be converted into a single rule
$X \to \gamma(Y_1, \dots, Y_n)$ of our RTG, such that $\mathrm{rk}(\gamma) = n$ and each of the $Y_i$ symbols
represents the possible adjunctions or substitutions of node $\gamma_i$. We introduce
accordingly two duplicates $N_A = \{X_A \mid X \in N\}$ and $N_S = \{X_S \mid X \in N\}$ of $N$,
and a nonterminal labeling function defined for any active node $\gamma_i$ with label
$\mathrm{lab}(\gamma_i) = X$ as

$$\mathrm{nt}(\gamma_i) = \begin{cases} X_A & \text{if } \gamma_i \text{ is an adjunction site} \\ X_S & \text{if } \gamma_i \text{ is a substitution site} \end{cases} \qquad (1)$$

The grammar rule corresponding to the elementary tree anchored by "one of"
in Figure 2 is then $NP_A \to \text{one of}(NP_A, D_A, P_A, N_A)$, meaning that this tree
adjoins into an NP labeled node, and expects adjunctions on its nodes $NP_r$, D,
P, and N. Given our set of elementary TAG trees, only the first one of these
four will be useful in a reduced RTG.

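Definition 1. The regular derivation tree grammar $G = \langle S_S, \mathcal{N}, \mathcal{F}, R \rangle$ of a
TAG $\langle \Sigma, N, I, A, S \rangle$ is a RTG with axiom $S_S$, nonterminal alphabet $\mathcal{N} = N_S \cup N_A$,
terminal alphabet $\mathcal{F} = I \cup A \cup \{\varepsilon_A\}$ with ranks $\mathrm{rk}(\gamma)$ for elementary trees
$\gamma$ in $I \cup A$ and rank 0 for $\varepsilon_A$, and with set of rules

$$\begin{aligned}
R = {} & \{X_S \to \alpha(\mathrm{nt}(\alpha_1), \dots, \mathrm{nt}(\alpha_n)) \mid \alpha \in I,\ n = \mathrm{rk}(\alpha),\ X = \mathrm{lab}(\alpha_r)\} \\
\cup {} & \{X_A \to \beta(\mathrm{nt}(\beta_1), \dots, \mathrm{nt}(\beta_n)) \mid \beta \in A,\ n = \mathrm{rk}(\beta),\ X = \mathrm{lab}(\beta_r)\} \\
\cup {} & \{X_A \to \varepsilon_A \mid X_A \in N_A\}
\end{aligned}$$

□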

The ε-rules $X_A \to \varepsilon_A$ for each symbol $X_A$ account for adjunction sites where
no adjunction takes place. The RTG has the same size as the original TAG and
the translation can be computed in linear time.
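
The construction of Definition 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Elementary` record, the `"adj"`/`"subst"` site tags, and the `_A`/`_S` string suffixes are hypothetical encodings chosen here, not notation from the paper. Each elementary tree contributes exactly one rule, and each nonterminal one ε-rule, which is why the translation is linear.

```python
from typing import List, NamedTuple, Set, Tuple

class Elementary(NamedTuple):
    """Hypothetical encoding of an elementary tree: only what Definition 1 needs."""
    name: str                       # anchor, used as the RTG terminal
    kind: str                       # "initial" or "auxiliary"
    nodes: List[Tuple[str, str]]    # active nodes (label, site); nodes[0] is the root

def nt(label: str, site: str) -> str:
    """Nonterminal labeling function of equation (1)."""
    return f"{label}_A" if site == "adj" else f"{label}_S"

def rtg_rules(trees: List[Elementary], nonterminals: Set[str]):
    """Return rules as (left-hand side, terminal, list of right-hand side nonterminals)."""
    rules = []
    for tree in trees:
        root_label = tree.nodes[0][0]
        # Initial trees rewrite substitution symbols, auxiliary trees adjunction symbols.
        suffix = "S" if tree.kind == "initial" else "A"
        rules.append((f"{root_label}_{suffix}", tree.name,
                      [nt(label, site) for label, site in tree.nodes]))
    # One epsilon-rule per adjunction symbol: "no adjunction takes place here".
    rules += [(f"{x}_A", "eps_A", []) for x in sorted(nonterminals)]
    return rules
```

Applied to a description of the "one of" tree with four adjunction sites (NP root, D, P, N), the function returns the rule $NP_A \to \text{one of}(NP_A, D_A, P_A, N_A)$ given in the text, before any reduction of the grammar.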



^1 We consider in particular that no adjunction can occur at a foot node. We do not consider
null adjunction constraints on root nodes and feature structures on null adjoining nodes,
which would rather obscure the presentation, and we do not treat other adjunction constraints
either.
Figure 3: Some trees generated by the regular tree grammar of Example 2.



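Example 2. The reduced regular tree grammar corresponding to the tree
adjoining grammar of Figure 2 is then:

$$\begin{aligned}
\langle\, & S_S,\ \{S_S, VP_S, VP_A, NP_S, NP_A\}, \\
& \{\text{one of}, \text{the}, \text{cats}, \text{has}, \text{caught}, \text{a}, \text{fish}, \varepsilon_A\}, \\
& \{\, S_S \to \text{caught}(NP_S, VP_A, NP_S), \\
& \quad NP_S \to \text{cats}(NP_A), \\
& \quad NP_S \to \text{fish}(NP_A), \\
& \quad NP_A \to \text{the}(NP_A), \\
& \quad NP_A \to \text{a}(NP_A), \\
& \quad NP_A \to \text{one of}(NP_A), \\
& \quad NP_A \to \varepsilon_A, \\
& \quad VP_A \to \text{has}(VP_A), \\
& \quad VP_A \to \varepsilon_A \,\}\,\rangle
\end{aligned}$$

□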



Let us recall that the derivation relation induced by a regular tree grammar
$G = \langle S_S, \mathcal{N}, \mathcal{F}, R \rangle$ relates terms^2 of $T(\mathcal{F}, \mathcal{N})$, so that $t \Rightarrow t'$ holds iff there
exists a context^3 $C$ and a rule $A \to a(B_1, \dots, B_n)$ such that $t = C[A]$ and
$t' = C[a(B_1, \dots, B_n)]$. The language of the RTG is $L(G) = \{t \in T(\mathcal{F}) \mid S_S \Rightarrow^* t\}$.

One can check that the grammar of Example 2 generates trees with a root
labeled with "caught", and three subtrees, the leftmost and rightmost of which
are labeled with "cats" or "fish" followed by an arbitrarily long combination of
nodes labeled with "one of", "a" or "the". The central subtree is an arbitrarily
long combination of nodes labeled with "has". Each branch terminates with $\varepsilon_A$.
Two of these trees can be seen in Figure 3. Our RTG generates the derivation trees
of a version of the original TAG stripped of its feature structures.
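
As a sanity check of this description, the rules of Example 2 can be enumerated mechanically. The sketch below is an illustration only: the dictionary encoding, the `eps_A` spelling of $\varepsilon_A$, and the height bound are assumptions made here to keep the (infinite) language finite for inspection.

```python
from itertools import product

# Rules of Example 2, written as nonterminal -> list of (terminal, child nonterminals).
RULES = {
    "S_S":  [("caught", ["NP_S", "VP_A", "NP_S"])],
    "NP_S": [("cats", ["NP_A"]), ("fish", ["NP_A"])],
    "NP_A": [("the", ["NP_A"]), ("a", ["NP_A"]), ("one of", ["NP_A"]), ("eps_A", [])],
    "VP_A": [("has", ["VP_A"]), ("eps_A", [])],
}

def trees(nonterminal, height):
    """Yield every tree derivable from `nonterminal` whose height is at most `height`."""
    if height == 0:
        return
    for terminal, children in RULES[nonterminal]:
        # Enumerate all combinations of subtrees for the right-hand side symbols.
        options = [list(trees(child, height - 1)) for child in children]
        for subtrees in product(*options):
            yield (terminal, subtrees)

def labels(tree):
    """Flatten a tree into the list of its node labels, in prefix order."""
    terminal, subtrees = tree
    return [terminal] + [label for sub in subtrees for label in labels(sub)]
```

Every enumerated tree has "caught" at its root with exactly three subtrees, and nothing prevents, for instance, "one of" from stacking on "fish" with arbitrary determiners below it: this is the over-generation that the feature structures are meant to rule out.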



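^2 The set of terms over the alphabet $\mathcal{F}$ and the set of variables $\mathcal{X}$ is denoted by $T(\mathcal{F}, \mathcal{X})$;
$T(\mathcal{F}, \emptyset) = T(\mathcal{F})$ is the set of trees over $\mathcal{F}$.

^3 A context $C$ is a term of $T(\mathcal{F}, \mathcal{X} \cup \{x\})$, $x \notin \mathcal{X}$, which contains a single occurrence of $x$.
The term $C[t]$ for some term $t$ of $T(\mathcal{F}, \mathcal{X})$ is obtained by replacing this occurrence by $t$.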



Feature Unification in TAG Derivation Trees 






3 Unification on TAG Derivation Trees 

3.1 Feature-based Regular Tree Grammars 

In order to extend the previous construction to feature-based TAGs, our RTGs
use combinations of rewrites and unifications, also dubbed narrowings (Hanus
1994), of terms with variables in $\mathcal{N} \times \mathcal{D}$, where $\mathcal{N}$ denotes the nonterminal
alphabet and $\mathcal{D}$ the set of feature structures.^4

Definition 3. A feature-based regular tree grammar $\langle S, \mathcal{N}, \mathcal{F}, \mathcal{D}, R \rangle$ comprises
an axiom $S$, a set $\mathcal{N}$ of nonterminal symbols that includes $S$, a ranked terminal
alphabet $\mathcal{F}$, a set $\mathcal{D}$ of feature structures, and a set $R$ of rules of form
$(A, d) \to a((B_1, d'_1), \dots, (B_n, d'_n))$, where $A, B_1, \dots, B_n$ are nonterminals,
$d, d'_1, \dots, d'_n$ are feature structures, and $a$ is a terminal with rank $n$.

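The derivation relation $\Rightarrow$ for a feature-based RTG $G = \langle S, \mathcal{N}, \mathcal{F}, \mathcal{D}, R \rangle$
relates pairs of terms from $T(\mathcal{F}, \mathcal{N} \times \mathcal{D})$ and u-substitutions, such that
$(s, e) \Rightarrow (t, e')$ iff there exist a context $C$, a rule $(A, d) \to a((B_1, d'_1), \dots, (B_n, d'_n))$ in
$R$ with fresh variables in the feature structures, a structure $d'$, and a
u-substitution $\sigma$ verifying

$$s = C[(A, d')], \qquad t = C[a((B_1, \sigma(d'_1)), \dots, (B_n, \sigma(d'_n)))],$$
$$\sigma = \mathrm{mgu}(d, e(d')) \quad \text{and} \quad e' = \sigma \circ e.$$

The language of $G$ is

$$L(G) = \{t \in T(\mathcal{F}) \mid \exists e,\ ((S, \top), \mathrm{id}) \Rightarrow^* (t, e)\}.$$

□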

Features percolate hierarchically through the computation of the most gen- 
eral unifier mgu at each derivation step, while the global u-substitution e acts 
as an environment that communicates unification results between the branches 
of our terms. 
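
The unification performed at each narrowing step can be sketched as follows. This is a simplified illustration, not the formalism itself: variables are encoded as strings starting with `?`, feature structures as nested dictionaries with identical key sets (closed first-order terms), the occurs check is omitted, and the `agr` values are made-up sample features. The substitution dictionary plays the role of the environment $e$.

```python
def is_var(x):
    """Variables are encoded (by assumption) as strings starting with '?'."""
    return isinstance(x, str) and x.startswith("?")

def walk(x, env):
    """Follow variable bindings in the environment until a non-bound value."""
    while is_var(x) and x in env:
        x = env[x]
    return x

def unify(a, b, env):
    """Return an environment extending `env` that unifies a and b, or None on a clash."""
    a, b = walk(a, env), walk(b, env)
    if a == b:
        return env
    if is_var(a):
        return {**env, a: b}
    if is_var(b):
        return {**env, b: a}
    if isinstance(a, dict) and isinstance(b, dict) and a.keys() == b.keys():
        for key in a:
            env = unify(a[key], b[key], env)
            if env is None:
                return None
        return env
    return None  # two different atoms, or structures of different shapes

def resolve(x, env):
    """Apply the environment to x, replacing bound variables recursively."""
    x = walk(x, env)
    if isinstance(x, dict):
        return {key: resolve(value, env) for key, value in x.items()}
    return x
```

For instance, unifying a node structure `{"top": "?t", "bot": {"agr": "3sg"}}` with a rule left-hand side `{"top": "?v", "bot": "?v"}` binds `?t` to `{"agr": "3sg"}`. Because the binding lives in the shared environment, any other branch of the term that mentions `?t` sees the result, which is the communication role the text assigns to $e$.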

Feature-based RTGs with a finite domain $\mathcal{D}$ are equivalent to regular tree
grammars. Unrestricted feature-based RTGs can encode Turing machines just
like unification grammars (Johnson 1988), and thus we can reduce the halting
problem on the empty input for Turing machines to the emptiness problem for
feature-based RTGs, which is thereby undecidable.



^4 In order to differentiate TAG tree substitutions from term substitutions, we call the latter
u-substitutions. Given two feature structures $d$ and $d'$ in $\mathcal{D}$, we denote by the u-substitution
$\sigma = \mathrm{mgu}(d, d')$ their most general unifier if it exists. We denote by $\top$ the most general element
of $\mathcal{D}$, and by $\mathrm{id}$ the identity.

3.2 Encoding Feature-based TAGs

For each tree $\gamma$ with rank $n$, we now create a rule $P \to \gamma(P_1, \dots, P_n)$. A
right-hand side pair $P_i = (\mathrm{nt}(\gamma_i), d'_i)$ stands for an active node $\gamma_i$ with
feature structure $d'_i = \mathrm{feats}(\gamma_i) = \begin{bmatrix} \mathrm{top} : \mathrm{top}(\gamma_i) \\ \mathrm{bot} : \mathrm{bot}(\gamma_i) \end{bmatrix}$, where $\mathrm{top}(\gamma_i)$ and $\mathrm{bot}(\gamma_i)$ denote
respectively the top and bottom feature structures of $\gamma_i$.

The left-hand side pair $P = (A, d)$ carries the interface $d = \mathrm{in}(\gamma)$ of $\gamma$ with
the rest of the grammar, such that $d$ percolates the root top feature, and the
foot bot feature for auxiliary trees. Formally, for each initial tree $\alpha$ in $I$ and
auxiliary tree $\beta$ in $A$, using a fresh variable $t$, we define



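$$\mathrm{in}(\alpha) = \begin{bmatrix} \mathrm{top} : \boxed{t}\ \mathrm{top}(\alpha_r) \end{bmatrix} \qquad (2)$$

$$\mathrm{in}(\beta) = \begin{bmatrix} \mathrm{top} : \boxed{t}\ \mathrm{top}(\beta_r) \\ \mathrm{bot} : \mathrm{bot}(\beta_f) \end{bmatrix} \qquad (3)$$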



The interface thus uses the top features of the root node of an elementary tree, 
and we have to implement the fact that this top structure is the same as the 
top structure of the variable that embodies the root node in the rule right-hand 
side. With the same variable t, we define accordingly: 



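$$\mathrm{feats}(\gamma_i) = \begin{cases} \begin{bmatrix} \mathrm{top} : t \\ \mathrm{bot} : \mathrm{bot}(\gamma_i) \end{bmatrix} & \text{if } \gamma_i = \gamma_r \\[2ex] \begin{bmatrix} \mathrm{top} : \mathrm{top}(\gamma_i) \\ \mathrm{bot} : \mathrm{bot}(\gamma_i) \end{bmatrix} & \text{otherwise} \end{cases} \qquad (4)$$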



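Finally, we add ε-rules $(X_A, \begin{bmatrix} \mathrm{top} : t \\ \mathrm{bot} : t \end{bmatrix}) \to \varepsilon_A$ for each symbol $X_A$ in order to
account for adjunction sites where no adjunction takes place. Let us denote by
$\mathrm{tr}(\gamma_i)$ the pair $(\mathrm{nt}(\gamma_i), \mathrm{feats}(\gamma_i))$.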

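Definition 4. The feature-based RTG $G = \langle S_S, N_S \cup N_A, \mathcal{F}, \mathcal{D}, R \rangle$ of a TAG
$\langle \Sigma, N, I, A, S \rangle$ with feature structures in $\mathcal{D}$ has terminal alphabet $\mathcal{F} = I \cup A \cup \{\varepsilon_A\}$
with respective ranks $\mathrm{rk}(\alpha)$, $\mathrm{rk}(\beta)$, and 0, and set of rules

$$\begin{aligned}
R = {} & \{(X_S, \mathrm{in}(\alpha)) \to \alpha(\mathrm{tr}(\alpha_1), \dots, \mathrm{tr}(\alpha_n)) \mid \alpha \in I,\ n = \mathrm{rk}(\alpha),\ X = \mathrm{lab}(\alpha_r)\} \\
\cup {} & \{(X_A, \mathrm{in}(\beta)) \to \beta(\mathrm{tr}(\beta_1), \dots, \mathrm{tr}(\beta_n)) \mid \beta \in A,\ n = \mathrm{rk}(\beta),\ X = \mathrm{lab}(\beta_r)\} \\
\cup {} & \{(X_A, \begin{bmatrix} \mathrm{top} : t \\ \mathrm{bot} : t \end{bmatrix}) \to \varepsilon_A \mid X_A \in N_A\}
\end{aligned}$$

□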



Example 5. With the grammar of Figure 2 we obtain the following ruleset:



SsT ■ 

NPs [top : t] . 

NPs ['op ■■ '] ■ 

Ttop : t -| 
r toy : t 1 
r top : t "I 

NPa \'Z\l\ ■ 

'A ^hot : [mode : ppart ] ] 

VP A \'z\:^ 



NPa 

NPa 
NP 



caught yNPs [top ■ Itgr : a 
cats (^NPa [Irf ■ [agr : 3pi ] 

&s}i{NPA['-p-t]) 



the NPa 



0], FPa 

]) 



node : ind J 
lode : ppart ] 



,NPsT 



a I NPa 
one of 



EA 



■ SA 



□ 



With the grammar of Example 5 one can generate the derivation tree for
"One of the cats has caught a fish." This derivation is presented in Figure 4.
Each node of the tree consists of a label and of a pair $(t, e)$ where $t$ is a term
from $T(\mathcal{F}, \mathcal{N} \times \mathcal{D})$ and $e$ is an environment.^5 In order to obtain fresh variables,
we rename variables from the RTG: we reuse the name of the variable in the
grammar, prefixed by the Gorn address of the node where the rewrite step
takes place. Labels indicate the chronological order of the narrowings in the
derivation.

^5 Actually, we only write the change in the environment at each point of the derivation.



Labels in Figure 4 suggest that this derivation has been computed with a 
left to right strategy. Of course, other strategies would have led to the same 
result. The important thing to notice here is that the crux of the derivation 
lies in the fifth rewrite step, where the agreement between the subject and the 
verb is realized. Substitution sites are completely defined when all adjunctions
in the subtree have been performed. In the next section we propose a different 
translation that overcomes this drawback. 



4 Left Corner Transformation 

Derivations in the previous feature-based RTG are not very predictive: the
substitution of "cats" into "caught" in the derivation of Figure 1b does not
constrain the agreement feature of "caught". This feature is only set at the
final ε-rewrite step after the adjunction of "one of", when the top and bottom
features are unified. More generally, given a substitution site, we cannot a priori
rule out the substitution of most initial trees, because their root usually does
not carry a top feature.

A solution to this issue is to compute the derivations in a transformed
grammar, where we start with the ε-rewrite, apply the root adjunctions in reverse
order, and end with the initial tree substitution. Since our encoding sets the
root adjunct as the leftmost child, this amounts to a selective left corner
transformation (Rosenkrantz and Lewis II 1970) of our RTG, an arguably simpler
intuition than what we could write for the corresponding transformation on
derived trees.



4.1 Transformed Regular Tree Grammars 

The transformation involves regular tree grammar rules of form $X_S \to \alpha(X_A, \dots)$
for substitutions, and $X_A \to \beta(X_A, \dots)$ and $X_A \to \varepsilon_A$ for root adjunctions. After
a reversal of the recursion of root adjunctions, we will first apply the ε rewrite
using a rule $X_S \to \varepsilon_S(X)$ with rank 1 for $\varepsilon_S$, followed by the root adjunctions
$X \to \beta(X, \dots)$, and finally the substitution itself $X \to \alpha(\dots)$, with a decremented
rank for initial trees.

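Example 6. On the grammar of Figure 2 we obtain the rules:

$$\begin{aligned}
S_S &\to \text{caught}(NP_S, VP_A, NP_S) \\
NP_S &\to \varepsilon_S(NP) \\
NP &\to \text{cats} \\
NP &\to \text{fish} \\
NP &\to \text{a}(NP) \\
NP &\to \text{the}(NP) \\
NP &\to \text{one of}(NP) \\
VP_A &\to \text{has}(VP_A) \\
VP_A &\to \varepsilon_A
\end{aligned}$$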

Adjunctions that do not occur on the root of an initial tree, like the
adjunction of "has" in our example, keep their original translation using $X_A \to \beta(X_A, \dots)$
and $X_A \to \varepsilon_A$ rules. We use the nonterminal symbols $X$ of the grammar for
root adjunctions and initial trees, and we retain $X_S$ for the initial $\varepsilon_S$ rewrite on
substitution nodes.

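Definition 7. The left-corner transformed RTG $G_{lc} = \langle S_S, N \cup N_S \cup N_A, \mathcal{F}_{lc}, R_{lc} \rangle$
of a TAG $\langle \Sigma, N, I, A, S \rangle$ has terminal alphabet $\mathcal{F}_{lc} = I \cup A \cup \{\varepsilon_A, \varepsilon_S\}$ with
respective ranks $\mathrm{rk}(\alpha) - 1$, $\mathrm{rk}(\beta)$, 0, and 1, and set of rules

$$\begin{aligned}
R_{lc} = {} & \{X_S \to \varepsilon_S(X) \mid X_S \in N_S\} \\
\cup {} & \{X \to \alpha(\mathrm{nt}(\alpha_2), \dots, \mathrm{nt}(\alpha_n)) \mid \alpha \in I,\ n = \mathrm{rk}(\alpha),\ X = \mathrm{lab}(\alpha_r)\} \\
\cup {} & \{X \to \beta(X, \mathrm{nt}(\beta_2), \dots, \mathrm{nt}(\beta_n)) \mid \beta \in A,\ n = \mathrm{rk}(\beta),\ X = \mathrm{lab}(\beta_r)\} \\
\cup {} & \{X_A \to \beta(\mathrm{nt}(\beta_1), \dots, \mathrm{nt}(\beta_n)) \mid \beta \in A,\ n = \mathrm{rk}(\beta),\ X = \mathrm{lab}(\beta_r)\} \\
\cup {} & \{X_A \to \varepsilon_A \mid X_A \in N_A\}
\end{aligned}$$

□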

Due to the duplicated rules for auxiliary trees, the size of the left-corner
transformed RTG of a TAG is doubled at worst. In practice, the reduced
grammar witnesses a reasonable growth (10% on the French TAG grammar
of Gardent (2006)).
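
The rule set of Definition 7 can be produced mechanically from a description of the elementary trees. The sketch below is illustrative only: the tuple encoding of trees, the `"adj"`/`"subst"` tags, and the `eps_S`/`eps_A` spellings are assumptions made here, and no grammar reduction is performed.

```python
def nt(label, site):
    """Nonterminal labeling function: X_A for adjunction sites, X_S for substitution sites."""
    return f"{label}_A" if site == "adj" else f"{label}_S"

def lc_rules(trees, nonterminals):
    """Left-corner transformed rules, as (left-hand side, terminal, right-hand side).

    `trees` is a list of (name, kind, nodes) where kind is "initial" or "auxiliary"
    and nodes lists the active nodes as (label, site) pairs, root first.
    """
    # Start every substitution with the eps_S rewrite.
    rules = [(f"{x}_S", "eps_S", [x]) for x in sorted(nonterminals)]
    for name, kind, nodes in trees:
        x = nodes[0][0]                                   # label of the root node
        others = [nt(label, site) for label, site in nodes[1:]]
        if kind == "initial":
            rules.append((x, name, others))               # rank decremented: root dropped
        else:
            rules.append((x, name, [x] + others))         # reversed root adjunction
            rules.append((f"{x}_A", name,                 # untouched non-root adjunction
                          [nt(label, site) for label, site in nodes]))
    rules += [(f"{x}_A", "eps_A", []) for x in sorted(nonterminals)]
    return rules
```

Each auxiliary tree yields two rules and each initial tree one, which makes the "doubled at worst" bound visible directly in the loop.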

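The transformation is easily reversed. We define accordingly the function
$\mathrm{lc}^{-1}$ from $T(\mathcal{F}_{lc})$ to $T(\mathcal{F})$:

$$\begin{aligned}
\mathrm{lc}^{-1}(\varepsilon_S(t)) &= s(t, \varepsilon_A) \\
s(\beta(t_1, t_2, \dots, t_n), t) &= s(t_1, \beta(t, f_{\beta_2}(t_2), \dots, f_{\beta_n}(t_n))) \\
s(\alpha(t_1, \dots, t_n), t) &= \alpha(t, f_{\alpha_2}(t_1), \dots, f_{\alpha_{n+1}}(t_n)) \\
a(\gamma(t_1, \dots, t_n)) &= \gamma(f_{\gamma_1}(t_1), \dots, f_{\gamma_n}(t_n)) \\
f_{\gamma_i}(t) &= \begin{cases} a(t) & \text{if } \gamma_i \text{ is an adjunction site} \\ \mathrm{lc}^{-1}(t) & \text{if } \gamma_i \text{ is a substitution site} \end{cases}
\end{aligned}$$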



We can therefore generate a derivation tree in $L(G_{lc})$ and recover the derivation
tree in $L(G)$ through $\mathrm{lc}^{-1}$.
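
A direct transcription of $\mathrm{lc}^{-1}$ may help in reading its equations. The code is a sketch under assumptions: trees are `(label, children)` tuples, and the `KIND` and `SITES` tables are a hypothetical description of the example trees with their full (unreduced) ranks, root first. They are not taken from Figure 2.

```python
EPS_A, EPS_S = "eps_A", "eps_S"

# Hypothetical description of the example trees: kind, and site type of each active node.
KIND = {"caught": "initial", "cats": "initial", "fish": "initial",
        "the": "auxiliary", "one of": "auxiliary", "has": "auxiliary"}
SITES = {"caught": ["adj", "subst", "adj", "subst"], "cats": ["adj"], "fish": ["adj"],
         "the": ["adj"], "one of": ["adj"], "has": ["adj"]}

def node(label, *children):
    return (label, tuple(children))

def f(label, i, t):
    """f_{gamma_i}: dispatch on the site type of the i-th active node (0-based)."""
    return a(t) if SITES[label][i] == "adj" else lc_inv(t)

def a(tree):
    """Adjunction subtrees keep their shape; only their children are converted."""
    label, children = tree
    return node(label, *(f(label, i, t) for i, t in enumerate(children)))

def s(tree, acc):
    """Undo the reversed recursion; `acc` is the root-adjunction subtree built so far."""
    label, children = tree
    if KIND[label] == "auxiliary":
        rest = (f(label, i, t) for i, t in enumerate(children[1:], start=1))
        return s(children[0], node(label, acc, *rest))
    rest = (f(label, i, t) for i, t in enumerate(children, start=1))
    return node(label, acc, *rest)

def lc_inv(tree):
    label, children = tree
    assert label == EPS_S, "a substitution subtree must start with eps_S"
    return s(children[0], node(EPS_A))
```

On the transformed subtree `eps_S(one of(the(cats)))`, the function returns `cats(the(one of(eps_A)))`: the root adjunctions that were traversed first in the transformed grammar end up nested below the initial tree, as in the original derivation trees.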



4.2 Features in the Transformed Grammar 

Example 8. Applying the same transformation on the feature-based regular
tree grammar, we obtain the following rules for the grammar of Figure 2:



SsT ^ caught {nPs [""• ■ ["■"^ -l] , VP a [^°^ ■ [™J-~ , NPs^^ 



NPs [fp - *] 

NP [bot : lagr : 3pl]] 



NP 
NP 



agr : 3ag 



VP A [bZ. ': [mode : ppart ] 

VP A 



ss {NP[lZ[l\ 
cats 



NPT fish 









NP 


bat : \co°Z ': +1 


-> the 




L def : +J . 





& NP 



one 
has 



rtap : t 1\ 

of(7vp[r::j.,;:3p:|]) 



□ 
Since we reversed the recursion of root adjunctions, the feature structures
on the left-hand side and on the root node of the right-hand side of auxiliary
rules are swapped in their transformed counterparts (e.g. in the rule for "one
of").

This version of a RTG for our example grammar is arguably much easier to
read than the one described in Example 5: a derivation has to go through "one
of" and "the" before adding "cats" as subject of "caught".

The formal translation of a TAG into a transformed feature-based RTG
requires the following variant $\mathrm{tr}_{lc}$ of the $\mathrm{tr}$ function: for any auxiliary tree $\beta$ in
$A$ and any node $\gamma_i$ of an elementary tree $\gamma$ in $I \cup A$, and with $t$ a fresh variable
of $\mathcal{D}$:

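$$\mathrm{in}_{lc}(\beta) = \begin{bmatrix} \mathrm{top} : t \\ \mathrm{bot} : \mathrm{bot}(\beta_f) \end{bmatrix} \qquad (5)$$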



featsic(7i) = ■( L'°'^'>°'(^-)J (6) 
l^feats(7i) otherwise 

$$\mathrm{tr}_{lc}(\gamma_i) = (\mathrm{nt}(\gamma_i), \mathrm{feats}_{lc}(\gamma_i)) \qquad (7)$$

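Definition 9. The left-corner transformed feature-based RTG
$G_{lc} = \langle S_S, N \cup N_S \cup N_A, \mathcal{F}_{lc}, \mathcal{D}, R_{lc} \rangle$ of a TAG $\langle \Sigma, N, I, A, S \rangle$ with feature structures in $\mathcal{D}$ has
terminal alphabet $\mathcal{F}_{lc} = I \cup A \cup \{\varepsilon_A, \varepsilon_S\}$ with respective ranks $\mathrm{rk}(\alpha) - 1$, $\mathrm{rk}(\beta)$,
0, and 1, and set of rules

$$\begin{aligned}
R_{lc} = {} & \{(X_S, [\mathrm{top} : t]) \to \varepsilon_S((X, \begin{bmatrix} \mathrm{top} : t \\ \mathrm{bot} : t \end{bmatrix})) \mid X_S \in N_S\} \\
\cup {} & \{(X, \mathrm{feats}(\alpha_1)) \to \alpha(\mathrm{tr}_{lc}(\alpha_2), \dots, \mathrm{tr}_{lc}(\alpha_n)) \mid \alpha \in I,\ n = \mathrm{rk}(\alpha),\ X = \mathrm{lab}(\alpha_r)\} \\
\cup {} & \{(X, \mathrm{feats}_{lc}(\beta_1)) \to \beta((X, \mathrm{in}_{lc}(\beta)), \mathrm{tr}_{lc}(\beta_2), \dots, \mathrm{tr}_{lc}(\beta_n)) \mid \beta \in A,\ n = \mathrm{rk}(\beta),\ X = \mathrm{lab}(\beta_r)\} \\
\cup {} & \{(X_A, \mathrm{in}(\beta)) \to \beta(\mathrm{tr}(\beta_1), \mathrm{tr}_{lc}(\beta_2), \dots, \mathrm{tr}_{lc}(\beta_n)) \mid \beta \in A,\ n = \mathrm{rk}(\beta),\ X = \mathrm{lab}(\beta_r)\} \\
\cup {} & \{(X_A, \begin{bmatrix} \mathrm{top} : t \\ \mathrm{bot} : t \end{bmatrix}) \to \varepsilon_A \mid X_A \in N_A\}
\end{aligned}$$

□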

Again, the translation can be computed in linear time, and results in a 
grammar with at worst twice the size of the original TAG. 



5 Conclusion 

We have introduced in this paper feature-based regular tree grammars as an
adequate representation for the derivation language of large coverage TAG
grammars. Unlike the restricted unification computations on the derivation tree
considered before by Kallmeyer and Romero (2004), feature-based RTGs
accurately translate the full range of unification mechanisms employed in TAGs.
Moreover, left-corner transformed grammars make derivations more predictable,
thus avoiding some backtracking in top-down generation.

Among the potential applications of our results, let us further mention more
accurate reachability computations between elementary trees, needed for
instance in order to check whether a TAG complies with the tree insertion
grammar (Schabes and Waters 1995, TIG) or regular form (Rogers 1994, RFTAG)
conditions. In fact, among the formal checks one might wish to perform on
grammars, many rely on the availability of reachability relations.

Let us finally note that we could consider the string language of a TAG
encoded as a feature-based RTG (in a parser for instance), if we extended the
model with topological information, in the line of Kuhlmann (2007).